Incremental view maintenance and data lineage tracing in heterogeneous database environments
Abstract
With the increasing amount and diversity of information available on the Internet, there has been rapid growth in information systems that need to integrate data from distributed, heterogeneous data sources. AutoMed (Automatic Generation of Mediator Tools for Heterogeneous database Integration) is a database transformation and integration system designed to support both virtual and materialized integration of schemas expressed in a variety of modelling languages. Previous work in the AutoMed project [21, 16, 17] developed a general framework to support schema transformation and integration in heterogeneous database architectures. The framework consists of a low-level hypergraph-based data model (HDM) and a set of primitive schema transformations on HDM schemas. We term the sequence of primitive transformations that transforms a schema S1 into a schema S2 a transformation pathway from S1 to S2.
The purpose of my research is to investigate techniques for incremental view maintenance and data lineage tracing for integrated databases that have been formed from heterogeneous source databases via AutoMed schema transformation pathways. (Data lineage tracing investigates how data in a data warehouse has been derived from the data sources.) My approach is to decompose the processes of incremental view maintenance and data lineage tracing into a sequence of simple steps based on the transformation pathways. I use a functional intermediate query language (IQL) as the query language in which to implement the algorithms for both problems.
The remainder of this paper is organized as follows. Section 2 outlines related work. Section 3 gives examples of the IQL language and of AutoMed transformation pathways. Section 4 presents my research questions and research approach. Section 5 gives the preliminary ideas and results achieved so far. Section 6 describes the contributions of my research so far and directions for future work.
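The idea of decomposing incremental view maintenance into simple steps along a pathway can be illustrated with a minimal sketch. This is not AutoMed code and not IQL: the names `select`, `project`, `run`, and `maintain` are hypothetical, bags are modelled with Python's `collections.Counter`, and every step is assumed to distribute over bag union (as selection and projection do). Joins and aggregates would need extra machinery that is not shown.

```python
from collections import Counter

# A "step" maps a bag of rows to a bag of rows. Every step here is assumed
# to distribute over bag union, so step(A + B) == step(A) + step(B).

def select(pred):
    """Hypothetical step: keep rows satisfying pred, preserving multiplicity."""
    return lambda bag: Counter({row: n for row, n in bag.items() if pred(row)})

def project(fn):
    """Hypothetical step: map each row through fn, summing multiplicities."""
    def step(bag):
        out = Counter()
        for row, n in bag.items():
            out[fn(row)] += n
        return out
    return step

def run(pathway, bag):
    """Apply the steps of a pathway in order."""
    for step in pathway:
        bag = step(bag)
    return bag

def maintain(view, pathway, inserted, deleted):
    """Update a materialized view by pushing only the deltas through the pathway."""
    view = view + run(pathway, inserted)
    view = view - run(pathway, deleted)  # Counter '-' drops non-positive counts
    return view

# Example: rows are (name, age, country); the view lists adults by country.
pathway = [select(lambda r: r[1] >= 18), project(lambda r: (r[0], r[2]))]
source = Counter([("alice", 30, "uk"), ("bob", 17, "uk")])
view = run(pathway, source)

inserted = Counter([("carol", 42, "fr")])
deleted = Counter([("alice", 30, "uk")])  # must already be present in source

new_view = maintain(view, pathway, inserted, deleted)
recomputed = run(pathway, source + inserted - deleted)
```

Under these assumptions, `new_view` matches `recomputed`, the result of re-evaluating the whole pathway over the updated source, while touching only the changed rows.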
Similar resources
Using Schema Transformation Pathways for Incremental View Maintenance
In heterogeneous data warehousing environments, autonomous data sources are integrated into a materialized integrated database. The schemas of the data sources and the integrated database may be expressed in different modelling languages. It is possible for either the data or the schemas of the data sources to be updated. Incremental view maintenance is one of the problems being addressed in da...
Storing auxiliary data for efficient maintenance and lineage tracing of complex views
As views in a data warehouse become more complex, the view maintenance process can become very complicated and potentially very inefficient. Storing auxiliary views in the warehouse can reduce the complexity and improve the efficiency of view maintenance, and the same auxiliary views can help in efficiently answering lineage tracing queries over the warehouse views. In this paper, we study the ...
Investigating a heterogeneous data integration approach for data warehousing
Data warehouses integrate data from remote, heterogeneous, autonomous data sources into a materialised central database. The heterogeneity of these data sources has two aspects, data expressed in different data models, called model heterogeneity, and data expressed within different schemas of the same data model, called schema heterogeneity. AutoMed is an approach to heterogeneous data transfor...
Increasing the speed of incremental view maintenance using the cuckoo algorithm
A data warehouse is a repository of integrated data collected from various sources. A data warehouse can maintain data from these sources in the form of views, so the views must be maintained and updated when the sources change. Since a growing number of updates may cause costly overhead, it is necessary to update views with high accuracy. Optimal Delta Evaluation method is...
Incremental Maintenance for Materialized Views over Semistructured Data
Semistructured data is not strictly typed like relational or object-oriented data and may be irregular or incomplete. It often arises in practice, e.g., when heterogeneous data sources are integrated or data is taken from the World Wide Web. Views over semistructured data can be used to filter the data and to restructure (or provide structure to) it. To achieve fast query response time, these vie...
Publication date: 2002